Journal article
Crypt4GH: A file format standard enabling native access to encrypted data
A Senf, R Davies, F Haziza, J Marshall, J Troncoso-Pastoriza, O Hofmann, TM Keane
Bioinformatics | OXFORD UNIV PRESS | Published : 2021
Abstract
Motivation: The majority of genome analysis tools and pipelines require data to be decrypted for access. This potentially leaves sensitive genetic data exposed, either because the unencrypted data is not removed after analysis, or because the data leaves traces on the permanent storage medium. Results: We defined a file container specification enabling direct byte-level compatible random access to encrypted genetic data stored in community standards such as SAM/BAM/CRAM/VCF/BCF. By standardizing this format, we show how it can be added as a native file format to genomic libraries, enabling direct analysis of encrypted data without the need to create a decrypted copy.
Grants
Awarded by European Commission
Funding Acknowledgements
This work was supported by the following grants: Wellcome (100956/Z/13/Z, 201535/Z/16/Z, 206194) [TMK, AS, RD], Strategic Focal Area "Personalized Health and Related Technologies (PHRT)" of the ETH Domain #2017-201 [JTP], European Joint Programme on Rare Diseases (EJP-RD) #825575 [FH], and NHMRC grant #1113531 and the Medical Research Future Fund [OH].